Personal genetic testing is transforming biology and medicine now that people have access to their own genetic instructions. However, each benefit, such as personalized medicine, assistance in criminal investigations and expanded research, is accompanied by a drawback. Personalized medicine is a privilege that seems to be reserved for those in higher socioeconomic classes, criminal investigations are riddled with privacy and consent obstacles, and research has taken advantage of minorities or has catered largely to people of European ancestry. There is also serious concern about privacy and about such personal data being leaked and falling into the wrong hands.
DNA polymorphisms, the most common of which are single-nucleotide polymorphisms (SNPs), can be detected in genomes in various ways. In past decades, methods such as Southern blots, PCR and hybridization techniques using microarray chips have been used to detect them. More recently, DNA-based molecular markers have been a breakthrough for detecting SNPs, as these markers can easily identify particular DNA sequences.
I was very lucky to have well-documented information about my family history on my paternal side; almost no questions were left unanswered. Our documents included birth locations and dates, death locations and dates, occupations, relocations, and any achievements or major life events. My mother’s side, however, is a complete mystery beyond immediate family.
I would expect to share around 50% of my DNA with each of my parents, 25% with each of my grandparents, 12.5% with each great-grandparent and so on. However, it is important to distinguish genealogical ancestors from genetic ancestors, as the latter are the ones from whom I actually inherited some DNA. This, of course, concerns autosomal DNA, as the sex chromosomes and mitochondrial DNA are passed down more directly. Beyond about eight generations back, the number of genetic ancestors is expected to increase roughly linearly, while the number of genealogical ancestors continues to increase exponentially. So starting around the eighth generation, I would begin to have ancestors from whom I inherited no DNA.
I share around the expected amount of DNA with my aunt (25 % expected, 23.34 % observed) and with my first cousin once removed (6.25 % expected, 5.17 % observed). I share 2.5 % and 2.11 % with two of my second cousins, against an expected 3.13 %. I share 1.78 %, 1.58 % and 1.14 % with three of my third cousins, all above the expected 0.78 %.
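The expected figures above follow from halving at each generation. As a rough sketch in base R (the function name `expected_ancestor_share` and the data frame layout are my own choices for illustration; the percentages are the ones quoted in the text):

```r
# Expected autosomal share (in %) with a direct ancestor n generations back
expected_ancestor_share <- function(n) 100 * 0.5^n

# Genealogical ancestors double every generation
genealogical_ancestors <- function(n) 2^n

expected_ancestor_share(1:3)   # parents, grandparents, great-grandparents
genealogical_ancestors(1:8)    # 2, 4, 8, ... ancestors per generation

# Expected vs observed sharing for the relatives listed in the text
relatives <- data.frame(
  relationship = c("aunt", "first cousin once removed",
                   "second cousin", "second cousin",
                   "third cousin", "third cousin", "third cousin"),
  expected_pct = c(25, 6.25, 3.13, 3.13, 0.78, 0.78, 0.78),
  observed_pct = c(23.34, 5.17, 2.5, 2.11, 1.78, 1.58, 1.14),
  stringsAsFactors = FALSE
)

# Ratio above 1 means more shared DNA than expected, below 1 means less
relatives$observed_over_expected <-
  round(relatives$observed_pct / relatives$expected_pct, 2)
print(relatives)
```

The ratio column makes it easy to see that the closer relatives fall slightly below expectation while the third cousins fall above it.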
My ethnicity results were exactly what I predicted. I already knew that I am 50 % Ashkenazi Jewish, as my paternal side of the family is 100 % Ashkenazi Jewish. My maternal side could have held some surprises, but the results matched my expectations. My 47 % Eastern European ancestry is all from my mother, who believes that her family is Don Cossack. I even expected the 2 % East Asian and Native American result, as Russian history, and thus genealogy, was heavily influenced by the Mongol Empire, which ruled over Russia in the 13th and 14th centuries. The German and French influence is most likely from my paternal side, as many Jews originally settled in Germany during their migration through Europe.
There was also 0.2 % Central Asian (Kazakhstan, Uzbekistan, Turkmenistan, etc.) trace DNA as well as 0.2 % North African and West Asian DNA. The Central Asian DNA either also came from Mongol rule over Russia or, as a more modern explanation, from the USSR, which included the Central Asian region. I was not sure where the 0.4 % trace North African DNA came from. To test my theories, I compared my results with those of some of the people 23andMe listed as possible relatives. My paternal aunt had no Central Asian DNA, which suggests that my Central Asian DNA most likely came from my mother. She did, however, have 1 % North African and West Asian DNA, which means that those results most likely come from my Jewish ancestors.
The maternal and paternal haplogroups offer approximate ancestry information from roughly ten thousand to a hundred thousand years ago. This is because mitochondrial DNA and most of the Y chromosome are passed down without recombination, so they are largely conserved from one generation to the next. Any mutations are therefore informative and can be traced back over hundreds of generations. Because other family members have also taken a 23andMe test, I was able to gather more information about my mitochondrial and Y chromosome results.
Mother: Mitochondrial: H1u Y chromosome: ?
Father: Mitochondrial: W3 Y chromosome: E-L29
My mitochondrial haplogroup is H1u, while if I had a Y chromosome, my haplogroup would be E-L29. After further research, I found that the H1u lineage is proposed to have split off from other H groups around modern-day Azerbaijan, in the general Caucasus area. This lines up with my mother’s Don Cossack ancestry, as the Don Cossacks are believed to have originated in the North Caucasus. The most common maternal haplogroup for Ashkenazi Jews is K, so it is interesting that my father’s maternal haplogroup is W, which is most common in Pakistan and northern India. Some research suggests that W3 originated in the Middle East but spread to Europe around 15,000 years ago, and that it now spans regions of Russia, North Africa, the Caucasus, the Near East, Mongolia and the Indian subcontinent. E-L29 also originated in the Middle East, about 4,000 years ago, and is extremely common in Ashkenazi Jews.
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.1.0 ✓ dplyr 1.0.5
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(DT)
library(gwascat)
## gwascat loaded. Use makeCurrentGwascat() to extract current image.
## from EBI. The data folder of this package has some legacy extracts.
# Load Files
mySNPs <- read_tsv("data/genome_A_G_v5_Full_20210322120133.txt", comment = '#',
                   col_types = cols(
                     rsid = col_character(),
                     chromosome = col_factor(),
                     position = col_integer(),
                     genotype = col_factor()
                   ))
updated_gwas_data <- as.data.frame(makeCurrentGwascat())
## running read.delim on http://www.ebi.ac.uk/gwas/api/search/downloads/alternative...
## formatting gwaswloc instance...
## NOTE: input data had non-ASCII characters replaced by '*'.
## Warning in which(!is.na(as.numeric(df$CHR_POS))): NAs introduced by coercion
## Warning in gwdf2GRanges(tab, extractDate = as.character(Sys.Date())): NAs
## introduced by coercion
## done.
max(updated_gwas_data$DATE.ADDED.TO.CATALOG)
## [1] "2021-04-16"
last_update <- max(updated_gwas_data$DATE.ADDED.TO.CATALOG)
filter(updated_gwas_data, DATE.ADDED.TO.CATALOG == last_update) %>% select(STUDY) %>% distinct()
## STUDY
## 1 Low-frequency variation near common germline susceptibility loci are associated with risk of Ewing sarcoma.
## 2 The Genetics of Circulating Resistin Level, A Biomarker for Cardiovascular Diseases, Is Informed by Mendelian Randomization and the Unique Characteristics of African Genomes.
## 3 Genetic Architecture of Abdominal Aortic Aneurysm in the Million Veteran Program.
## 4 A genome-wide association study on fish consumption in a Japanese population-the Japan Multi-Institutional Collaborative Cohort study.
## 5 GWAS of peptic ulcer disease implicates Helicobacter pylori infection, other gastrointestinal disorders and depression.
## 6 Genetic basis of lacunar stroke: a pooled analysis of individual patient data and genome-wide association studies.
filter(updated_gwas_data, DATE.ADDED.TO.CATALOG == last_update) %>% select(LINK) %>% distinct()
## LINK
## 1 www.ncbi.nlm.nih.gov/pubmed/32881892
## 2 www.ncbi.nlm.nih.gov/pubmed/32876488
## 3 www.ncbi.nlm.nih.gov/pubmed/32981348
## 4 www.ncbi.nlm.nih.gov/pubmed/32895509
## 5 www.ncbi.nlm.nih.gov/pubmed/33608531
## 6 www.ncbi.nlm.nih.gov/pubmed/33773637
mySNPs_gwas_table <- inner_join(mySNPs, updated_gwas_data, by = c("rsid" = "SNPS"))
# The catalog stores the risk allele as "rsid-ALLELE"; keep only the final character
mySNPs_gwas_table$risk_allele_clean <- str_sub(mySNPs_gwas_table$STRONGEST.SNP.RISK.ALLELE, -1)
mySNPs_gwas_table$my_allele_1 <- str_sub(mySNPs_gwas_table$genotype, 1, 1)
mySNPs_gwas_table$my_allele_2 <- str_sub(mySNPs_gwas_table$genotype, 2, 2)
# Count how many of my two alleles (0, 1 or 2) match the reported risk allele
mySNPs_gwas_table$have_risk_allele_count <-
  if_else(mySNPs_gwas_table$my_allele_1 == mySNPs_gwas_table$risk_allele_clean, 1, 0) +
  if_else(mySNPs_gwas_table$my_allele_2 == mySNPs_gwas_table$risk_allele_clean, 1, 0)
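To make the risk-allele counting step concrete, here is a minimal sketch on a made-up table. It uses base R (`substr`, `nchar`) instead of stringr so that it runs without any packages; the rsIDs and genotypes are invented for illustration and are not from my data.

```r
# Toy rows mimicking the joined table; all values are hypothetical
toy <- data.frame(
  rsid = c("rs0000001", "rs0000002", "rs0000003"),
  genotype = c("AG", "GG", "CT"),
  STRONGEST.SNP.RISK.ALLELE = c("rs0000001-G", "rs0000002-G", "rs0000003-A"),
  stringsAsFactors = FALSE
)

# Risk allele = last character of the "rsid-ALLELE" string
n <- nchar(toy$STRONGEST.SNP.RISK.ALLELE)
toy$risk_allele_clean <- substr(toy$STRONGEST.SNP.RISK.ALLELE, n, n)

# Each comparison is TRUE/FALSE; adding them gives 0, 1 or 2 copies
toy$have_risk_allele_count <-
  (substr(toy$genotype, 1, 1) == toy$risk_allele_clean) +
  (substr(toy$genotype, 2, 2) == toy$risk_allele_clean)

print(toy[, c("rsid", "genotype", "risk_allele_clean", "have_risk_allele_count")])
```

In this toy table the heterozygote AG carries one copy of G, the homozygote GG carries two, and CT carries no copy of A. Taking only the last character is a simplification: catalog entries whose risk allele is unknown or longer than one base would need separate handling.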
There are three medical concerns that I wanted to investigate with my SNPs: asthma, type-2 diabetes, and Crohn’s disease together with associated IBD or IBS.
I have a family history of type-2 diabetes, so I wanted to see which SNPs associated with diabetes risk I carry.
filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype) %>%
filter(str_detect(tolower(DISEASE.TRAIT), "diabetes")) %>%
distinct()
## # A tibble: 367 x 4
## rsid DISEASE.TRAIT risk_allele your_geneotype
## <chr> <chr> <chr> <fct>
## 1 rs121169… Type 2 diabetes G AG
## 2 rs171061… Type 2 diabetes G GG
## 3 rs121401… Type 2 diabetes G GG
## 4 rs602633 Coronary heart disease x type 2 diabete… T GT
## 5 rs2282456 Type 2 diabetes G GG
## 6 rs6032 Macrovascular complications in type 2 d… T TT
## 7 rs4077468 Cystic fibrosis-related diabetes A AA
## 8 rs3024505 Type 1 diabetes G GG
## 9 rs340874 Type 2 diabetes C CT
## 10 rs340874 Type 2 diabetes T CT
## # … with 357 more rows
filter(mySNPs, rsid == "rs12255372")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs12255372 10 114808902 GG
Not linked to type-2 diabetes or breast cancer
filter(mySNPs, rsid == "rs4402960")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs4402960 3 185511687 GG
Not linked to type-2 diabetes
filter(mySNPs, rsid == "rs7754840")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs7754840 6 20661250 CG
Not linked to type-2 diabetes
filter(mySNPs, rsid == "rs12255372")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs12255372 10 114808902 GG
no increased risk of T2D
Carrying two copies of a common variant of TCF7L2 doubles the chance of developing diabetes, a risk similar to that of being clinically obese.
filter(mySNPs, rsid == "rs7903146")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs7903146 10 114758349 CC
Normal (lower) risk of Type 2 Diabetes and Gestational Diabetes.
filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype) %>%
filter(str_detect(tolower(DISEASE.TRAIT), "celiac")) %>%
distinct()
## # A tibble: 17 x 4
## rsid DISEASE.TRAIT risk_allele your_geneotype
## <chr> <chr> <chr> <fct>
## 1 rs13003464 Celiac disease G AG
## 2 rs10188217 Crohn's disease and celiac disease C CT
## 3 rs7574865 Celiac disease or Rheumatoid arthritis T GT
## 4 rs4678523 Celiac disease C CT
## 5 rs11712165 Celiac disease G GT
## 6 rs6822844 Celiac disease G GG
## 7 rs424232 Celiac disease C CC
## 8 rs10806425 Celiac disease A AC
## 9 rs2041570 Refractory celiac disease type II A AG
## 10 rs11984075 Celiac disease or Rheumatoid arthritis G AG
## 11 rs1953126 Celiac disease or Rheumatoid arthritis T CT
## 12 rs1250552 Celiac disease A AG
## 13 rs7104791 Celiac disease T CT
## 14 rs3184504 Celiac disease C CT
## 15 rs653178 Celiac disease or Rheumatoid arthritis C CT
## 16 rs2664156 Celiac disease C CC
## 17 rs11203203 Celiac disease or Rheumatoid arthritis A AA
I carry one of the two SNPs associated with increased risk of Crohn’s disease.
filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype) %>%
filter(str_detect(tolower(DISEASE.TRAIT), "colorectal")) %>%
distinct()
## # A tibble: 79 x 4
## rsid DISEASE.TRAIT risk_allele your_geneotype
## <chr> <chr> <chr> <fct>
## 1 rs72647… Colorectal cancer T TT
## 2 rs75426… Colorectal cancer C CC
## 3 rs10920… Metastasis in stage I-III microsatellite… T CT
## 4 rs66911… Colorectal cancer T GT
## 5 rs17011… Colorectal cancer or advanced adenoma G AG
## 6 rs66877… Colorectal cancer G AG
## 7 rs885036 Progression free survival in metastatic … A AG
## 8 rs21637… Colorectal cancer G GG
## 9 rs11626… Colorectal cancer G GG
## 10 rs651907 Colorectal cancer C CC
## # … with 69 more rows
filter(mySNPs, rsid == "rs16892766")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs16892766 8 117630683 AA
filter(mySNPs, rsid == "rs4779584")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs4779584 15 32994756 CC
filter(mySNPs, rsid == "rs58920878")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## # genotype <fct>
No increased risk of colorectal cancer from rs16892766 or rs4779584; rs58920878 is not in my data.
filter(mySNPs, rsid == "rs4939827")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs4939827 18 46453463 CC
0.73x decreased risk for colorectal cancer
I have been “diagnosed” with asthma, but I have never experienced an asthma attack or had trouble breathing. Therefore, I wanted to see if I have any SNPs associated with asthma.
filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype) %>%
filter(str_detect(tolower(DISEASE.TRAIT), "asthma")) %>%
distinct()
## # A tibble: 153 x 4
## rsid DISEASE.TRAIT risk_allele your_geneotype
## <chr> <chr> <chr> <fct>
## 1 rs734999 Asthma C CT
## 2 rs301806 Allergic disease (asthma, hay fever or e… T CT
## 3 rs12932… Asthma G GG
## 4 rs22285… Asthma T GT
## 5 rs48456… Asthma G GG
## 6 rs41292… Asthma T TT
## 7 rs903361 Asthma A AG
## 8 rs903361 Asthma G AG
## 9 rs10174… Allergic disease (asthma, hay fever or e… G GG
## 10 rs232542 Asthma (time to childhood onset) x early… C CC
## # … with 143 more rows
filter(mySNPs, rsid == "rs1695")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs1695 11 67352689 AA
normal asthma risk in certain populations
filter(mySNPs, rsid == "rs2303067")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## # genotype <fct>
This asthma and atopic dermatitis SNP is not in my data.
filter(mySNPs, rsid == "rs4794067")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs4794067 17 45808828 CC
2.1x risk for Aspirin Induced Asthma. But possibly lower risk of lupus and intractable Graves’ disease.
The following SNPs are all associated with roughly 3x increased asthma risk if exposed to smoke.
filter(mySNPs, rsid == "rs2305480")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs2305480 17 38062196 GG
~3x increased asthma risk if exposed to smoke
filter(mySNPs, rsid == "rs4795400")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## # genotype <fct>
This SNP is not in my data.
filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype) %>%
filter(str_detect(tolower(DISEASE.TRAIT), "memory")) %>%
distinct()
## # A tibble: 10 x 4
## rsid DISEASE.TRAIT risk_allele your_geneotype
## <chr> <chr> <chr> <fct>
## 1 rs726296… Short-term memory (digit-span task) C CC
## 2 rs9004 Logical memory (immediate recall) in no… T CT
## 3 rs112379… Verbal declarative memory T TT
## 4 rs110747… Verbal declarative memory T TT
## 5 rs429358 Logical memory (immediate recall) C CT
## 6 rs429358 Logical memory (delayed recall) C CT
## 7 rs429358 Age-related cognitive decline (memory) … C CT
## 8 rs4420638 Verbal declarative memory G AG
## 9 rs6046393 Verbal declarative memory T CT
## 10 rs1010304 Verbal declarative memory A AA
filter(mySNPs, rsid == "rs4680")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs4680 22 19951271 GG
Warrior: Val, less exploratory, higher COMT enzymatic activity, therefore lower dopamine levels; higher pain threshold, better stress resiliency, albeit with a modest reduction in executive cognition performance under most conditions
filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype) %>%
filter(str_detect(tolower(DISEASE.TRAIT), "nicotine")) %>%
distinct()
## # A tibble: 12 x 4
## rsid DISEASE.TRAIT risk_allele your_geneotype
## <chr> <chr> <chr> <fct>
## 1 rs1060061 Nicotine dependence T CT
## 2 rs4668485 Nicotine dependence symptom count T CT
## 3 rs9379896 Nicotine dependence symptom count C CC
## 4 rs62392942 Nicotine dependence T TT
## 5 rs4132568 Nicotine glucouronidation A AA
## 6 rs11763343 Nicotine dependence symptom count A AG
## 7 rs4285401 Nicotine use A AG
## 8 rs7385760 Nicotine dependence symptom count T CT
## 9 rs10828623 Nicotine dependence symptom count T CT
## 10 rs16969968 Fagerstr**m test for nicotine dependen… G AG
## 11 rs8075300 Nicotine dependence symptom count C CC
## 12 rs2836823 Nicotine dependence T CT
filter(mySNPs, rsid == "rs3750344 ")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## # genotype <fct>
filter(mySNPs, rsid == "rs1051730 ")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## # genotype <fct>
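Neither targeted lookup returned a row. Both rsID strings carry a trailing space, so they could not match even if the SNPs were genotyped; these two queries therefore say nothing about nicotine dependence alleles. The GWAS table above does list 12 nicotine-related entries for which I carry at least one reported risk allele.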
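An rsID string with stray whitespace never matches, so an empty result can come from a typo rather than from a missing SNP. Below is a small sketch of a lookup helper that trims the ID first. `lookup_snp` and the toy table are hypothetical names introduced here for illustration, and the rsIDs are made up.

```r
# Toy stand-in for mySNPs; values are invented for illustration
toy_snps <- data.frame(
  rsid = c("rs0000001", "rs0000002"),
  genotype = c("AG", "CC"),
  stringsAsFactors = FALSE
)

# Trim whitespace before matching and say so when nothing is found
lookup_snp <- function(snps, id) {
  id <- trimws(id)
  hit <- snps[snps$rsid == id, , drop = FALSE]
  if (nrow(hit) == 0) {
    message(id, " is not in the data")
  }
  hit
}

lookup_snp(toy_snps, "rs0000001 ")  # trailing space is ignored, row is found
lookup_snp(toy_snps, "rs9999999")   # genuinely absent, message is printed
```

With the tidyverse loaded, the same effect could be had by wrapping the ID in `str_trim()` inside `filter()`.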
filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
select(DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype) %>%
filter(str_detect(tolower(DISEASE.TRAIT), "drug")) %>%
distinct()
## # A tibble: 18 x 3
## DISEASE.TRAIT risk_allele your_geneotype
## <chr> <chr> <fct>
## 1 Medication use (drugs used in diabetes) T CT
## 2 QT interval (drug interaction) T CT
## 3 Cough in response to angiotensin-converting enzym… C CT
## 4 Drug-induced Stevens-Johnson syndrome or toxic ep… C CC
## 5 Drug-induced Stevens-Johnson syndrome or toxic ep… G GG
## 6 Drug-induced Stevens-Johnson syndrome or toxic ep… A AA
## 7 Adverse response to chemotherapy (neutropenia/leu… A AG
## 8 Adverse response to chemotherapy (neutropenia/leu… G AG
## 9 Liver injury in anti-tuberculosis drug treatment A AA
## 10 Medication use (drugs used in diabetes) A AG
## 11 Medication use (drugs used in diabetes) G GT
## 12 Medication use (drugs used in diabetes) T TT
## 13 QT interval (drug interaction) A AG
## 14 Adverse response to chemotherapy (neutropenia/leu… C CT
## 15 Adverse response to chemotherapy (neutropenia/leu… T CT
## 16 Adverse response to chemotherapy (neutropenia/leu… G GG
## 17 Illicit drug use G AG
## 18 QT interval (drug interaction) A AA
Ashkenazi related alleles
filter(mySNPs, rsid == "rs11209026")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs11209026 1 67705958 GG
higher risk for certain autoimmune diseases.
filter(mySNPs, rsid == "rs386833395")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs386833395 17 41276045 II
filter(mySNPs, rsid == "rs80357906")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs80357906 17 41209083 DD
no BRCA1 variants
filter(mySNPs, rsid == "rs80359550")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs80359550 13 32914438 II
no BRCA2 variant
not a carrier for cystic fibrosis
filter(mySNPs, rsid == "rs121965064")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs121965064 4 187201412 TT
filter(mySNPs, rsid == "rs373297713")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## # genotype <fct>
filter(mySNPs, rsid == "rs121965063")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs121965063 4 187195347 GG
Not a carrier of hemophilia C at the two genotyped SNPs (about 1 in 23 Ashkenazi Jews are carriers); rs373297713 is not in my data.
filter(mySNPs, rsid == "rs111033171")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## # genotype <fct>
filter(mySNPs, rsid == "rs137853022")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## # genotype <fct>
filter(mySNPs, rsid == "rs28939712")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## # genotype <fct>
None of these three SNPs is in my data, so familial dysautonomia carrier status cannot be determined from this file.
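A lookup that returns zero rows means the SNP is absent from the raw file, which is different from observing a non-risk genotype. The sketch below separates the two cases for a whole list of rsIDs in one pass. It uses base R on a made-up table; the rsIDs and genotypes are hypothetical and not taken from my data.

```r
# Toy stand-in for mySNPs; values are invented for illustration
toy_snps <- data.frame(
  rsid = c("rs0000001", "rs0000002"),
  genotype = c("AG", "CC"),
  stringsAsFactors = FALSE
)

# SNPs I want to check for one condition
wanted <- c("rs0000001", "rs0000003", "rs0000004")

# Rows that were actually genotyped
genotyped <- toy_snps[toy_snps$rsid %in% wanted, , drop = FALSE]

# IDs that are missing from the file altogether
not_genotyped <- setdiff(wanted, toy_snps$rsid)

print(genotyped)
print(not_genotyped)
```

Only the IDs in `genotyped` support a statement about carrier status; anything in `not_genotyped` would need a different test.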
filter(mySNPs, rsid == "rs333")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## # genotype <fct>
rs333 is not in my data, so resistance to HIV cannot be assessed from this file.
filter(mySNPs, rsid == "rs662799")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs662799 11 116663707 AG
1.4x higher early heart attack risk; less weight gain on high fat diets
filter(mySNPs, rsid == "rs7495174")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs7495174 15 28344238 AA
blue/gray eyes more likely
filter(mySNPs, rsid == "rs12913832")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs12913832 15 28365618 GG
blue eye color, 99% of the time
filter(mySNPs, rsid == "rs1799971")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs1799971 6 154360797 AA
No stronger alcohol cravings
filter(mySNPs, rsid == "rs4988235")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs4988235 2 136608646 AA
Can digest lactose
filter(mySNPs, rsid == "rs590787")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs590787 1 25629943 AG
Rh positive. I knew I was type A; now I know I’m A+.
filter(mySNPs, rsid == "rs4675690")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs4675690 2 208507807 TT
show less disgust
filter(mySNPs, rsid == "rs1015362")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs1015362 20 32738612 CT
2-4x higher risk of sun sensitivity if part of risk haplotype.
filter(mySNPs, rsid == "rs4911414")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs4911414 20 32729444 GT
2-4x higher risk of sun sensitivity if part of risk haplotype
filter(mySNPs, rsid == "rs12821256")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs12821256 12 89328335 TT
no additional likelihood of blonde hair
filter(mySNPs, rsid == "rs12203592")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs12203592 6 396321 CT
likely presence of freckles, brown hair and high sensitivity of skin to sun exposure.
filter(mySNPs, rsid == "rs35264875")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs35264875 11 68846399 AT
one blonde variant
filter(mySNPs, rsid == "rs12896399")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs12896399 14 92773663 TT
Lighter hair color & blue eyes more likely
filter(mySNPs, rsid == "rs1042522")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs1042522 17 7579472 CC
Live 3 years longer. Chemotherapy is more effective.
filter(mySNPs, rsid == "rs6968865")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs6968865 7 17287269 TT
Associated with (slightly) increased coffee consumption
Drug Metabolism
filter(mySNPs, rsid == "rs4986893")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs4986893 10 96540410 GG
filter(mySNPs, rsid == "rs28399504")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs28399504 10 96522463 AA
filter(mySNPs, rsid == "rs41291556")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs41291556 10 96535173 TT
normal metabolizer of several commonly prescribed drugs
filter(mySNPs, rsid == "rs12248560")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs12248560 10 96521657 CT
ultra fast metabolizer of proton pump inhibitors and benefit from tamoxifen treatment; drug metabolism effects; also 0.77x decreased breast cancer risk
filter(mySNPs, rsid == "rs8099917")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs8099917 19 39743165 GT
Moderately lower odds of responding to PEG-IFNalpha/RBV treatment (Hepatitis C treatments)
filter(mySNPs, rsid == "rs1057910")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs1057910 10 96741053 AC
average 40% reduction in warfarin metabolism (1/2 SNPs)
filter(mySNPs, rsid == "rs1800460")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs1800460 6 18139228 CT
impaired capability of detoxifying byproducts of certain drugs (antineoplastic and immunosuppressant)
filter(mySNPs, rsid == "rs1800462")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs1800462 6 18143955 CC
incapable of detoxifying certain drugs (antineoplastic and immunosuppressant)
filter(mySNPs, rsid == "rs1142345")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs1142345 6 18130918 CT
impaired drug metabolism (antineoplastic and immunosuppressant)
filter(mySNPs, rsid == "rs11212617")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs11212617 11 108283161 AC
Somewhat increased likelihood of treatment success with metformin (which treats diabetes, for which I have an increased risk)
filter(mySNPs, rsid == "rs2395029")
## # A tibble: 1 x 4
## rsid chromosome position genotype
## <chr> <fct> <int> <fct>
## 1 rs2395029 6 31431780 TT
No increased risk of drug-induced liver injury when prescribed flucloxacillin
filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
select(rsid, your_genotype = genotype, strongest_risk_allele = risk_allele_clean, DISEASE.TRAIT, STUDY)
## # A tibble: 23,142 x 5
## rsid your_genotype strongest_risk_a… DISEASE.TRAIT STUDY
## <chr> <fct> <chr> <chr> <chr>
## 1 rs1126… CT C IgG glycosylati… Loci associated wit…
## 2 rs2803… AA A Body mass index Meta-analysis of ge…
## 3 rs2803… AA A Body mass index Meta-analysis of ge…
## 4 rs4252… CT T Height Hundreds of variant…
## 5 rs1079… CT C Ulcerative coli… Host-microbe intera…
## 6 rs7349… CT C Ulcerative coli… Meta-analysis ident…
## 7 rs7349… CT C Asthma Genome-wide analysi…
## 8 rs3748… AG A Primary scleros… Genome-wide associa…
## 9 rs3748… AG A Primary scleros… Dense genotyping of…
## 10 rs3890… CT T Rheumatoid arth… Common variants at …
## # … with 23,132 more rows
datatable(
filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
select(rsid, your_genotype = genotype, strongest_risk_allele = risk_allele_clean, DISEASE.TRAIT, STUDY )
)
## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html
datatable(
filter(mySNPs_gwas_table,have_risk_allele_count > 0 & (str_detect(tolower(INITIAL.SAMPLE.SIZE), "european") | str_detect(tolower(REPLICATION.SAMPLE.SIZE), "european")) & (RISK.ALLELE.FREQUENCY > 0 & !is.na(RISK.ALLELE.FREQUENCY))) %>%
arrange(RISK.ALLELE.FREQUENCY) %>%
select(rsid, your_genotype = genotype, DISEASE.TRAIT, INITIAL.SAMPLE.SIZE,RISK.ALLELE.FREQUENCY)
)
## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html
trait_entry_count <- group_by(mySNPs_gwas_table, DISEASE.TRAIT) %>%
filter(have_risk_allele_count >= 1) %>%
summarise(count_of_entries = n())
ggplot(filter(trait_entry_count, count_of_entries > 100), aes(x = reorder(DISEASE.TRAIT, count_of_entries, sum), y = count_of_entries)) +
geom_col() +
coord_flip() +
theme_bw() +
labs(title = "Traits with over 100 GWAS Catalog entries\nwhere I carry a risk allele", y = "Count of entries", x = "Trait")
# Summarise proportion of SNPs for a given trait where you have a risk allele
trait_snp_proportion <- filter(mySNPs_gwas_table, risk_allele_clean %in% c("C" ,"A", "G", "T") & my_allele_1 %in% c("C" ,"A", "G", "T") & my_allele_2 %in% c("C" ,"A", "G", "T") ) %>%
mutate(you_have_risk_allele = if_else(have_risk_allele_count >= 1, 1, 0)) %>%
group_by(DISEASE.TRAIT, you_have_risk_allele) %>%
summarise(count_of_snps = n_distinct(rsid)) %>%
mutate(total_snps_for_trait = sum(count_of_snps), proportion_of_snps_for_trait = count_of_snps / sum(count_of_snps) * 100) %>%
filter(you_have_risk_allele == 1) %>%
arrange(desc(proportion_of_snps_for_trait)) %>%
ungroup()
## `summarise()` has grouped output by 'DISEASE.TRAIT'. You can override using the `.groups` argument.
trait_study_count <- filter(mySNPs_gwas_table, risk_allele_clean %in% c("C" ,"A", "G", "T") & my_allele_1 %in% c("C" ,"A", "G", "T") & my_allele_2 %in% c("C" ,"A", "G", "T") ) %>%
group_by(DISEASE.TRAIT) %>%
summarise(count_of_studies = n_distinct(PUBMEDID), mean_risk_allele_freq = mean(RISK.ALLELE.FREQUENCY, na.rm = TRUE))
trait_snp_proportion <- inner_join(trait_snp_proportion, trait_study_count, by = "DISEASE.TRAIT")
ggplot(filter(trait_snp_proportion, count_of_studies > 1 & proportion_of_snps_for_trait > 70), aes(x = reorder(DISEASE.TRAIT, proportion_of_snps_for_trait, sum), y = proportion_of_snps_for_trait, fill = mean_risk_allele_freq)) +
geom_col() +
coord_flip() +
theme_bw() +
labs(title = "Traits where I carry the risk allele at over 70% of\nstudied SNPs (more than one study)",
y = "% of SNPs with risk allele", x = "Trait", fill = "Mean risk allele frequency")
datatable(trait_snp_proportion)
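The per-trait proportion can be hard to follow in the full pipeline, so here is a minimal sketch of the same idea on a made-up table. It uses base R (`tapply`) rather than dplyr so it runs without packages; the trait names, rsIDs and counts are invented for illustration.

```r
# Toy rows: one row per trait/SNP pair, all values hypothetical
toy <- data.frame(
  DISEASE.TRAIT = c("Trait A", "Trait A", "Trait A", "Trait B", "Trait B"),
  rsid = c("rs1", "rs2", "rs3", "rs4", "rs5"),
  have_risk_allele_count = c(1, 0, 2, 0, 0),
  stringsAsFactors = FALSE
)

# SNPs per trait where at least one risk allele is carried
carried <- tapply(toy$have_risk_allele_count >= 1, toy$DISEASE.TRAIT, sum)

# Distinct SNPs studied per trait
total <- tapply(toy$rsid, toy$DISEASE.TRAIT, function(x) length(unique(x)))

# Percentage of studied SNPs with a carried risk allele
proportion <- 100 * carried / total
print(round(proportion, 1))
```

One difference from the pipeline above: a trait with no carried risk alleles (Trait B here) shows up as 0 in this sketch, whereas the dplyr version drops it when it filters on `you_have_risk_allele == 1`.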
datatable(
filter(mySNPs_gwas_table, have_risk_allele_count == 2) %>%
select(rsid, your_genotype = genotype, strongest_risk_allele = risk_allele_clean, DISEASE.TRAIT, STUDY )
)
## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html
Being Ashkenazi, I already had an increased susceptibility to certain gastrointestinal conditions, and my results reaffirmed that I should visit a gastroenterologist and most likely get further testing done. I was not aware that I was predisposed to type-2 diabetes; this does not need to be mentioned to my medical provider, but I should take it into consideration in my lifestyle choices. What I discovered regarding drug and medication metabolism shocked me: I was not at all aware that I carry so many SNPs associated with impaired drug metabolism. That is something I will most definitely tell my medical provider. Additionally, purely out of curiosity, I would like to have my mother genetically tested because her family history is such a mystery.